
    Faster Rates for Policy Learning

    This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme

    Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits

    We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our setting. We then study a variant of Thompson sampling for Bernoulli rewards and a variant of KL-UCB for both single-parameter exponential families and bounded, finitely supported rewards. We show these algorithms are asymptotically optimal, both in rate and leading problem-dependent constants, including in the thick margin setting where multiple arms fall on the decision boundary

    Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy

    We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-n rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations. Comment: Published at http://dx.doi.org/10.1214/15-AOS1384 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Evaluating the Impact of Treating the Optimal Subgroup

    Suppose we have a binary treatment used to influence an outcome. Given data from an observational or controlled study, we wish to determine whether or not there exists some subset of observed covariates in which the treatment is more effective than the standard practice of no treatment. Furthermore, we wish to quantify the improvement in population mean outcome that will be seen if this subgroup receives treatment and the rest of the population remains untreated. We show that this problem is surprisingly challenging given how often it is an (at least implicit) study objective. Blindly applying standard techniques fails to yield any apparent asymptotic results, while using existing techniques to confront the non-regularity does not necessarily help at distributions where there is no treatment effect. Here we describe an approach to estimate the impact of treating the subgroup which benefits from treatment that is valid in a nonparametric model and is able to deal with the case where there is no treatment effect. The approach is a slight modification of an approach that recently appeared in the individualized medicine literature

    Optimal Methods for Using Posterior Probabilities in Association Testing

    Objective: The use of haplotypes to impute the genotypes of unmeasured single nucleotide variants continues to rise in popularity. Simulation results suggest that the use of the dosage as a one-dimensional summary statistic of imputation posterior probabilities may be optimal both in terms of statistical power and computational efficiency; however, little theoretical understanding is available to explain and unify these simulation results. In our analysis, we provide a theoretical foundation for the use of the dosage as a one-dimensional summary statistic of genotype posterior probabilities from any technology. Methods: We analytically evaluate the dosage, mode and the more general set of all one-dimensional summary statistics of two-dimensional (three posterior probabilities that must sum to 1) genotype posterior probability vectors. Results: We prove that the dosage is an optimal one-dimensional summary statistic under a typical linear disease model and is robust to violations of this model. Simulation results confirm our theoretical findings. Conclusions: Our analysis provides a strong theoretical basis for the use of the dosage as a one-dimensional summary statistic of genotype posterior probability vectors in related tests of genetic association across a wide variety of genetic disease models
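As a small illustration of the two summary statistics named in the abstract above, the sketch below computes the dosage (the expected number of copies of the counted allele) and the mode (the most probable genotype) from a vector of three genotype posterior probabilities. The function names and the input convention (probabilities ordered for 0, 1, and 2 copies) are assumptions made for this example.

```python
from typing import Sequence


def _check_posterior(posterior: Sequence[float], tol: float = 1e-8) -> None:
    """Validate that the input is three non-negative probabilities summing to 1."""
    if len(posterior) != 3:
        raise ValueError("expected three genotype posterior probabilities")
    if any(p < 0 for p in posterior) or abs(sum(posterior) - 1.0) > tol:
        raise ValueError("probabilities must be non-negative and sum to 1")


def dosage(posterior: Sequence[float]) -> float:
    """Expected allele count: 0*P(0 copies) + 1*P(1 copy) + 2*P(2 copies)."""
    _check_posterior(posterior)
    _, p1, p2 = posterior
    return p1 + 2.0 * p2


def mode_genotype(posterior: Sequence[float]) -> int:
    """Most probable genotype (0, 1, or 2 copies); ties go to the smaller count."""
    _check_posterior(posterior)
    return max(range(3), key=lambda g: posterior[g])
```

For the posterior `(0.1, 0.3, 0.6)`, the mode keeps only the hard call of 2 copies, while the dosage stays on a continuous scale between 0 and 2 and so reflects the remaining uncertainty.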

    Immigration and National Security in the UAE: Between Realism & Constructivism

    This research project empirically analyses the impact of immigration (and related issues like citizenship and demographics) on national security in the UAE. We identify a range of security threats hypothesized to arise from immigration, whether perceived or real, and test them in the current Emirati context. We base our analyses on 30 interviews of Emirati citizens with significant access to decision-making venues and/or reflective expertise on the issue, to gain a cross-section of current Emirati thinking on security threats stemming from immigration

    Statistical Inference for the Mean Outcome Under a Possibly Non-Unique Optimal Treatment Strategy

    We consider challenges that arise in the estimation of the value of an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular asymptotically linear (RAL) estimator of this parameter. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-n rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. In particular, we develop an estimator that, when properly standardized, converges to a normal limiting distribution. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We outline an extension of our approach to a multiple time point problem in the appendix. All of our results are supported by simulations